Postgres
Querying Overview
Postgres is an open source SQL database. It supports array, JSONB, JSON, and vector data types and has several supporting query functions and query syntax extensions. Qarbine uses the Postgres node.js driver for interacting with the database.
A general reference for Postgres querying can be found at https://www.postgresql.org/docs/current/queries.html
Handling JSON Data
For information on JSON functions see https://www.postgresql.org/docs/current/functions-json.html
Consider a table defined as the following.
CREATE TABLE jproducts (
id SERIAL PRIMARY KEY,
data JSONB
)
The table is populated with this statement.
INSERT INTO jproducts (data)
VALUES ('{"name": "Laptop", "price": 1200, "specs": {"cpu": "Intel i7", "ram": "16GB"}}')
Traditional psql or legacy SQL tool output is shown below.
Notice the data column data is just a string.
In Qarbine, the query specification below retrieves all the rows.
SELECT * FROM jproducts
The results are shown below with the first row selected.
Notice the data column value is a real object. Conversions to JSON objects happen automatically. This enables much easier analytics than manually trying to deal with the simple string value.
Access to the result row values of interest in a template formula may use either
@current.data.name
or
#data.name
This example can be found at “example/Postgres/JSON products”.
Manipulating Row Shape
Qarbine can be directed to further modify the answer set elements This is done using ‘pragmas” which are discussed in the general Data Source Designer documentation. The most common pragma likely to be used is “pullFieldsUp CSV_list_of_fields”.
This answer set can be simplified by using Qarbine pragmas as shown below.
#pragma pullFieldsUp data
SELECT * FROM jproducts
The results are shown below with the first row selected.
The fields that were previously within the data field object have been pulled up a level. This makes it more convenient to reference field values in a template formula. Access to the result row values of interest in a template formula may now use either
@current.name
or
#name
This example can be found at “example/Postgres/JSON products with pragmas”.
Determining Postgres Extensions
To see what extensions are installed you can run the following query
SELECT extname, extversion
FROM pg_extension
A popular extension is one supporting vector queries which is discussed below. This example can be found at “example/Postgres/Installed extensions”.
PG Vector Querying
pgvector is an open-source extension for PostgreSQL that adds support for vector operations and similarity searches. For complete details see https://github.com/pgvector/pgvector.
Supported distance functions are listed in the table below.
Operator | Description |
---|---|
<-> | Euclidian L2 distance |
<#> | (negative) inner product |
<=> | cosine distance |
<+> | L1 distance (added in 0.7.0) |
<˜> | Hamming distance (binary vectors, added in 0.7.0) |
<%> | Jaccard distance (binary vectors, added in 0.7.0) |
Below is a nearest neighbor query.
SELECT * FROM items ORDER BY embedding <-> '[3,1,2]' LIMIT 5
Get rows within a certain distance.
SELECT * FROM items WHERE embedding <-> '[3,1,2]' < 5
Qarbine can be configured to interact with popular LLM services such as Open AI, Azure Open AI, Google Gemini, and AWS Bedrock. These services may be called to dynamically obtain a vector by using the embedded(userInput) macro function. A general approach to doing this would use a macro function
Below is a nearest neighbor query with a dynamic vector.
SELECT * FROM items
ORDER BY embedding <-> '[! embedding(@userInput) !]'
LIMIT 5
Below is a query to get rows within a certain distance with a dynamic vector.
SELECT * FROM items
WHERE embedding <-> '[! embedding(@userInput) !]' < 5
Remember that the number of dimensions for the search vector and the stored vector embedding must match. The former is based on the AI Assistant as configured by the Qarbine administrator.
Below is an example of using dynamic vectors which obtains the embedding vector for the string “foo” from the AI Assistant withthe alias “aiGemini”..
SELECT *
FROM vitems
WHERE embedding <-> '[! embeddings("foo", "aiGemini” ) !]' < 5
Note the single quotes around the block macro expression which starts with “[!” and ends with “!]” to yield the required Postgres vector syntax.
Timestamp Handling
The Postgres function to_timestamp() can be used to convert a string into a timestamp. You can mix regular JSON functions with Qarbine pragmas to achieve various results.
Consider a table that has a JSONB column containing a field storing a timestamp as a string. A sample query is
select * from jstock
The results are shown below with a row selected.
This example can be found at “example/Postgres/JSON stock”.
The data the series was first published it within the item object as the firstPublished field. We can use the following pragma to perform the conversion in-place.
#pragma convertToDate item.firstPublished
select * from jstock
The pragmas are evaluated in order upon the answer set. We can combine pragmas such as shown below.
#pragma pullFieldsUp item
#pragma convertToDate firstPublished
select * from jstock
The results are shown below with a row selected.
The convertToDate argument has been modified to take into account the effect of the pragma above it.
This example can be found at “example/Postgres/JSON stock with pragmas”.
Postgres can extract the first published date from the JSON string and convert it into a real timestamp. An example is shown below.
#pragma pullFieldsUp item
SELECT
to_timestamp(item->>'firstPublished', 'DD-Mon-YYYY') AS first_published_timestamp,
item
FROM jstock
A sample row is shown below.
Notice that the timestamp ‘15-jan-2000’ shows up as an EST display string as that is the timezone of the browser.
This example can be found at “example/Postgres/JSON stock with firstPublished as timestamp”.
Troubleshooting
If errors occur while using Qarbine then a good course of action is to run the query using your traditional Postgres tools. This can range from psql or PGAdmin for example.